LSP Speech Synthesis Using Backpropagation Networks
Abstract
A multi-layer perceptron (MLP) similar to that used in the NETtalk system is used to form a mapping between sequences of allophones and the corresponding frames of LPC synthesizer control parameters. Three parameter sets equivalent to the LPC coefficients are evaluated: line spectral pair (LSP), PARCOR and log area ratio. In addition to a standard MLP, networks that have been decomposed by phonetic class and by allophone are trained. Decomposition is found to reduce training time and to produce greater accuracy on the training set; however, the network decomposed by allophone is found to receive too few training patterns to generalise properly to new data.
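The mapping described in the abstract can be illustrated with a minimal sketch. It assumes a NETtalk-style sliding window of one-hot encoded allophones feeding a single-hidden-layer MLP whose linear outputs form one frame of synthesizer parameters. The allophone inventory, window width, layer sizes and function names below are all hypothetical and are not taken from the paper. The last helper shows one common convention for converting PARCOR (reflection) coefficients to log area ratios; the sign convention varies between texts.

```python
import numpy as np

# Hypothetical allophone inventory; the paper's real inventory is not given here.
ALLOPHONES = ["_", "h", "e", "l", "ou"]  # "_" stands for padding/silence
INDEX = {a: i for i, a in enumerate(ALLOPHONES)}


def encode_window(seq, centre, width=3):
    """One-hot encode `width` allophones centred on position `centre`.

    Positions that fall outside the sequence are filled with the padding symbol.
    """
    half = width // 2
    vec = np.zeros(width * len(ALLOPHONES))
    for slot, pos in enumerate(range(centre - half, centre + half + 1)):
        sym = seq[pos] if 0 <= pos < len(seq) else "_"
        vec[slot * len(ALLOPHONES) + INDEX[sym]] = 1.0
    return vec


def mlp_forward(x, w1, b1, w2, b2):
    """Forward pass: sigmoid hidden layer, linear output (one parameter frame)."""
    hidden = 1.0 / (1.0 + np.exp(-(w1 @ x + b1)))
    return w2 @ hidden + b2


def parcor_to_lar(k):
    """Log area ratios from reflection coefficients with |k| < 1.

    Uses the convention LAR = log((1 + k) / (1 - k)); some texts use the inverse ratio.
    """
    k = np.asarray(k, dtype=float)
    return np.log((1.0 + k) / (1.0 - k))


# Usage: untrained random weights, 10 output parameters per frame (assumed size).
rng = np.random.default_rng(0)
x = encode_window(["h", "e", "l", "ou"], centre=0)
w1, b1 = rng.normal(size=(8, x.size)), np.zeros(8)
w2, b2 = rng.normal(size=(10, 8)), np.zeros(10)
frame = mlp_forward(x, w1, b1, w2, b2)
```

In this sketch the weights are random, so `frame` is meaningless until the network is trained by backpropagation against target parameter frames. A decomposed variant in the spirit of the abstract would keep one such weight set per phonetic class or per allophone and select it from the centre symbol.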
Similar resources
The Use of Vector Quantization in Neural Speech Synthesis
Our previous work has indicated that multilayer perceptrons (MLPs) trained using the backpropagation (BP) algorithm have great difficulty in learning continuous mappings with sufficient accuracy for speech synthesis. The use of vector quantization allows networks to be trained to select a sequence of entries from a codebook of speech parameter vectors. For the network to be able to generalise mean...
Histogram-based spectral equalization for HMM-based speech synthesis using mel-LSP
This paper describes a statistical spectral parameter emphasis technique for HMM-based speech synthesis using mel-scaled line spectral pair (mel-LSP). Spectral parameter emphasis is effective for compensating over-smoothed spectra in HMM-based speech synthesis. However, there is no conventional technique that satisfies such requirements as automatic tuning for different speakers and realtime sy...
Diphone Synthesis Using a Neural Network
A neural network is used to produce formant data for the Holmes parallel formant speech synthesizer [2] from an allophonic transcription of plain English text. This paper presents results obtained from training a backpropagation neural network using speech generated by a conventional speech synthesizer. The network is able to learn to reproduce the basic form of formant transitions between all...
Voice conversion based on RBF neural network
Recently, voice conversion has become a research hotspot because of its wide range of applications. However, voice conversion technology is still immature. Building on a study of existing voice conversion models, a voice conversion system based on an RBF neural network was designed, and a simulation of the system was implemented. During conversion, the unvoiced speech was excluded and the ...
Glove-Talk: a neural network interface between a data-glove and a speech synthesizer
To illustrate the potential of multilayer neural networks for adaptive interfaces, a VPL Data-Glove connected to a DECtalk speech synthesizer via five neural networks was used to implement a hand-gesture to speech system. Using minor variations of the standard backpropagation learning procedure, the complex mapping of hand movements to speech is learned using data obtained from a single speake...